Sampled fictitious play for multi-action stochastic dynamic programs

Authors

Archis Ghate

Shih-Fen Cheng

Stephen Baumert

Daniel Reaume

Dushyant Sharma

Robert L. Smith

Abstract

We introduce a class of finite-horizon dynamic optimization problems that we call multi-action stochastic dynamic programs (DPs). Their distinguishing feature is that the decision in each state is a multi-dimensional vector. These problems can in principle be solved using Bellman's backward recursion. However, the complexity of this procedure grows exponentially in the dimension of the decision vectors. This is called the curse of action-space dimensionality. To overcome this computational challenge, we propose an approximation algorithm rooted in the game-theoretic paradigm of Sampled Fictitious Play (SFP). SFP solves a sequence of DPs with a one-dimensional action space, which are exponentially smaller than the original multi-action stochastic DP. In particular, the computational effort in a fixed number of SFP iterations is linear in the dimension of the decision vectors. We show that the sequence of SFP iterates converges to a local optimum, and present a numerical case study in manufacturing where SFP is able to find solutions with objective values within 1% of the optimal objective value hundreds of times faster than the time taken by backward recursion. In this case study, SFP solutions are also better by a statistically significant margin than those found by a one-step lookahead heuristic.

Corresponding author: Industrial and Systems Engineering, Box 352650, The University of Washington, Seattle, WA 98195. Email: [email protected].
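The contrast the abstract draws between exponential and linear effort can be illustrated with a small sketch. The code below is not the authors' SFP algorithm: it omits the sampling from the history of past plays entirely and shows only a plain coordinate-wise best-response sweep, in which each sub-problem is a DP over a single action coordinate. The toy instance, the function names (`make_toy_problem`, `backward_recursion`, `coordinate_sweep`), and the stationary random rewards and transitions are all assumptions made for illustration.

```python
import itertools
import random


def make_toy_problem(num_states=4, dim=3, levels=3, seed=0):
    """Hypothetical toy instance: random rewards and transition rows."""
    rng = random.Random(seed)
    reward, trans = {}, {}
    for s in range(num_states):
        for a in itertools.product(range(levels), repeat=dim):
            reward[(s, a)] = rng.random()
            weights = [rng.random() for _ in range(num_states)]
            total = sum(weights)
            trans[(s, a)] = [w / total for w in weights]
    return reward, trans


def q_value(s, a, reward, trans, v_next):
    """Immediate reward plus expected value of the next stage."""
    return reward[(s, a)] + sum(p * v for p, v in zip(trans[(s, a)], v_next))


def backward_recursion(reward, trans, num_states, horizon, dim, levels):
    """Exact solve: enumerates all levels**dim joint actions per state and stage."""
    v = [0.0] * num_states
    evals = 0
    for _ in range(horizon):
        new_v = []
        for s in range(num_states):
            best = float("-inf")
            for a in itertools.product(range(levels), repeat=dim):
                evals += 1
                best = max(best, q_value(s, a, reward, trans, v))
            new_v.append(best)
        v = new_v
    return v, evals


def coordinate_sweep(policy, reward, trans, num_states, horizon, dim, levels):
    """One sweep: for each coordinate i, solve a DP over coordinate i alone,
    holding the other coordinates at the current policy (updated in place)."""
    evals = 0
    v = [0.0] * num_states
    for i in range(dim):
        v = [0.0] * num_states
        for t in reversed(range(horizon)):
            new_v = []
            for s in range(num_states):
                base = policy[t][s]
                best_a, best_q = base, float("-inf")
                for x in range(levels):
                    a = base[:i] + (x,) + base[i + 1:]
                    evals += 1
                    q = q_value(s, a, reward, trans, v)
                    if q > best_q:
                        best_a, best_q = a, q
                policy[t][s] = best_a
                new_v.append(best_q)
            v = new_v
    return v, evals


if __name__ == "__main__":
    S, T, d, k = 4, 3, 3, 3
    reward, trans = make_toy_problem(S, d, k)
    _, exact_evals = backward_recursion(reward, trans, S, T, d, k)
    policy = [[(0,) * d for _ in range(S)] for _ in range(T)]
    _, sweep_evals = coordinate_sweep(policy, reward, trans, S, T, d, k)
    print("joint enumeration evaluations:", exact_evals)
    print("one coordinate sweep evaluations:", sweep_evals)
```

In this sketch the exact solve performs `horizon * num_states * levels**dim` action evaluations, while one sweep performs `dim * levels * num_states * horizon`. A sweep can stall at a policy that no single-coordinate change improves, which is the kind of local optimum the abstract refers to.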

Similar resources

Sampled fictitious play for approximate dynamic programming

Sampled Fictitious Play (SFP) is a recently proposed iterative learning mechanism for computing Nash equilibria of non-cooperative games. For games of identical interests, every limit point of the sequence of mixed strategies induced by the empirical frequencies of best response actions that players in SFP play is a Nash equilibrium. Because discrete optimization problems can be viewed as games...

A Computationally Efficient Implementation of Fictitious Play for Large-Scale Games

The paper is concerned with distributed learning and optimization in large-scale settings. The well-known Fictitious Play (FP) algorithm has been shown to achieve Nash equilibrium learning in certain classes of multi-agent games. However, FP can be computationally difficult to implement when the number of players is large. Sampled FP is a variant of FP that mitigates the computational difficulti...

Sampled Fictitious Play for Black-Box Stochastic Sequential Decision Problems

In this paper, we propose an algorithm based on Sampled Fictitious Play for solving finite-horizon stochastic sequential decision problems. Our method models the decision problem as a game of identical interest between multiple players, who use the history of their past plays to improve the estimate of optimal reward in the initial state. We show that this method is able to find an optimal polic...

Stochastic fictitious play with continuous action sets

Continuous action space games form a natural extension to normal form games with finite action sets. However, whilst learning dynamics in normal form games are now well studied, it is not until recently that their continuous action space counterparts have been examined. We extend stochastic fictitious play to the continuous action space framework. In normal form games the limiting behaviour of ...

Sampled Fictitious Play is Hannan Consistent

Fictitious play is a simple and widely studied adaptive heuristic for playing repeated games. It is well known that fictitious play fails to be Hannan consistent. Several variants of fictitious play including regret matching, generalized regret matching and smooth fictitious play, are known to be Hannan consistent. In this note, we consider sampled fictitious play: at each round, the player sam...

Publication date: 2013
